Active selection with label propagation for minimizing human effort in speaker annotation of TV shows

Authors

  • Mateusz Budnik
  • Johann Poignant
  • Laurent Besacier
  • Georges Quénot
Abstract

This paper presents an approach that minimizes human involvement in the manual annotation of speakers. At each iteration, a selection strategy chooses the most suitable speech track for manual annotation, and the annotation is then associated with all the tracks in the cluster that contains it. The study uses a system that propagates speaker track labels by means of agglomerative clustering with constraints. Several unsupervised active learning selection strategies are evaluated. The approach can also be used to efficiently generate sets of speech tracks for training biometric models; in this case both the length of the speech track for a given person and its purity are taken into consideration. The system was evaluated on the REPERE video corpus. Along with the speech tracks extracted from the videos, an optical character recognition system was adapted to extract the names of potential speakers, and these names were then used as the 'cold start' for the selection method.
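The select, annotate, propagate loop described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the `Track` type, the `select_longest_unlabeled` strategy, and the `oracle` callback are hypothetical names introduced here, and the cluster assignments are assumed to be given rather than produced by the constrained agglomerative clustering the paper uses.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Track:
    track_id: str
    cluster_id: int               # assumed: produced by an earlier clustering step
    duration: float               # seconds of speech in the track
    label: Optional[str] = None   # speaker name once known


def select_longest_unlabeled(tracks: List[Track]) -> Optional[Track]:
    """Hypothetical strategy: pick the longest track that has no label yet."""
    candidates = [t for t in tracks if t.label is None]
    return max(candidates, key=lambda t: t.duration) if candidates else None


def annotate_with_propagation(
    tracks: List[Track], oracle: Callable[[str], str], budget: int
) -> int:
    """Run up to `budget` rounds and return the number of manual queries made."""
    queries = 0
    for _ in range(budget):
        chosen = select_longest_unlabeled(tracks)
        if chosen is None:        # every track already carries a label
            break
        name = oracle(chosen.track_id)   # stands in for the human annotator
        queries += 1
        # Propagate the manual label to every track in the same cluster.
        for t in tracks:
            if t.cluster_id == chosen.cluster_id and t.label is None:
                t.label = name
    return queries


# Toy data: five tracks in three clusters, with made-up speaker names.
tracks = [
    Track("t1", 0, 12.0), Track("t2", 0, 4.5),
    Track("t3", 1, 8.0), Track("t4", 1, 3.0),
    Track("t5", 2, 6.0),
]
truth = {"t1": "Alice", "t2": "Alice", "t3": "Bob", "t4": "Bob", "t5": "Carol"}
queries = annotate_with_propagation(tracks, truth.__getitem__, budget=10)
labels = {t.track_id: t.label for t in tracks}
```

In this toy setup every cluster is pure, so one query per cluster labels all tracks correctly. With impure clusters, propagation would spread wrong labels, which is presumably why the abstract mentions taking track purity into account.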

Similar resources

An Active Learning Method for Speaker Identity Annotation in Audio Recordings

Given that manual annotation of speech is an expensive and lengthy process, we attempt in this paper to assist an annotator in performing speaker diarization. This assistance takes place in an annotation background for a large amount of archives. We propose a method that reduces the number of human interventions. This method corrects a diarization by taking into account the human interventions....

Partition sampling: an active learning selection strategy for large database annotation

Annotating a video database requires intensive, time-consuming and error-prone human effort. However, this is a mandatory task to efficiently analyze multimedia contents. We propose a new selection strategy for active learning methods to minimize human effort in labeling a large database of video sequences. Formally, active learning is a process where new unlabeled samples are iteratively s...

Active Frame Selection for Label Propagation in Videos

Manually segmenting and labeling objects in video sequences is quite tedious, yet such annotations are valuable for learning-based approaches to object and activity recognition. While automatic label propagation can help, existing methods simply propagate annotations from arbitrarily selected frames (e.g., the first one) and so may fail to best leverage the human effort invested. We define an a...

Combining Active Learning and Partial Annotation for Domain Adaptation of a Japanese Dependency Parser

The machine learning-based approaches that dominate natural language processing research require massive amounts of labeled training data. Active learning has the potential to substantially reduce the human effort needed to prepare this data by allowing annotators to focus on only the most informative training examples. This paper shows that active learning can be used for domain adaptation of ...

Combining Active Learning and Partial Annotation for Japanese Dependency Parsing

The machine learning-based approaches that dominate natural language processing research require massive amounts of labeled training data. Active learning has the potential to substantially reduce the human effort needed to prepare this data by allowing annotators to focus on only the most informative training examples. This paper shows how active learning can be used for domain adaptation of d...

Publication date: 2014